Collect information on ongoing education during a year
Information about ongoing education (so-called course data) is available at course level, where each unit is a course with an associated course identifier. A course is defined by the combination of person and course type, so one individual can in practice be registered with several course types at the same time.
Since the unit in course data is not a person, the data cannot be imported into person-level datasets in the usual way, but must be connected using the merge command.
First, you have to add a link between the course ID and the person ID on the course data. You then have to aggregate up to person level using the command collapse before finally connecting to the personal dataset.
In the example, a personal dataset is first created consisting of people residing in Norway (regstatus == '1') as of 2020-01-01. A history of ongoing education is then retrieved for the entire year 2020, keeping only education at a higher level (master’s or higher, levels 7 and 8). The command collapse (count) is used to count the number of observations with ongoing education per individual over the year 2020, and the result is then linked to the person dataset for further analysis.
NB! Note that after using collapse, the course type variable (coursetype, stored here under the name courses) no longer holds course type codes. It holds the values of the statistic being run, in this case the number of observations (count).
//Connect to datastore
require no.ssb.fdb:30 as db
//Create individual level dataset containing residents in Norway as of 2020-01-01
create-dataset persondata
import db/BEFOLKNING_KJOENN as gender
import db/BEFOLKNING_STATUSKODE 2020-01-01 as regstatus
keep if regstatus == '1'
//Retrieve people who are studying for a master's degree or higher during 2020
create-dataset coursedata
import-event db/NUDB_KURS_NUS 2020-01-01 to 2020-12-31 as coursetype
destring coursetype
keep if coursetype >= 700000 & coursetype < 900000
//Add link between courseid and personal id number
create-dataset link_course_person
import db/NUDB_KURS_FNR as idnr
merge idnr into coursedata
//Create statistics (collapse) over number of events containing higher degree education per individual, and link this into the person dataset
use coursedata
collapse (count) coursetype -> courses, by(idnr)
merge courses into persondata
//Produce table counting the number of people studying for a higher degree during 2020
use persondata
generate edu_high = 0
replace edu_high = 1 if courses >= 1
tabulate edu_high gender
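The three steps of the script (link course to person, collapse to person level, merge into the person dataset) can also be mimicked outside microdata.no to check one's understanding of the logic. The following is only a minimal sketch in plain Python, not microdata.no code. All identifiers and values (c1, p1, the course codes and so on) are invented for illustration, and treating unmatched persons as missing after the merge is an assumption about how the result behaves.

```python
from collections import Counter

# Hypothetical toy data, invented for illustration only.
# Course-level data: course id -> course type code.
course_type = {"c1": 759999, "c2": 659999, "c3": 859999, "c4": 712345}
# Link between course id and person id (the role of idnr in the script).
course_person = {"c1": "p1", "c2": "p1", "c3": "p2", "c4": "p1"}
# Person-level dataset with one variable per person.
persons = {"p1": {"gender": "1"}, "p2": {"gender": "2"}, "p3": {"gender": "2"}}

# Keep only courses at level 7 or 8 (same condition as the keep command).
high_level = [cid for cid, code in course_type.items() if 700000 <= code < 900000]

# Attach the person id to each course and count courses per person,
# which corresponds to a count collapse grouped by idnr.
courses_per_person = Counter(course_person[cid] for cid in high_level)

# Merge the counts into the person dataset. Persons without any
# higher-level course get None (assumed here to represent a missing value).
for pid, row in persons.items():
    row["courses"] = courses_per_person.get(pid)
    has_course = row["courses"] is not None and row["courses"] >= 1
    row["edu_high"] = 1 if has_course else 0

for pid in sorted(persons):
    print(pid, persons[pid])
```

In this toy data, person p1 has three courses but only two at level 7 or 8, so the count reflects the filtered course data rather than all registered courses. Person p3 has no courses at all and ends up with edu_high equal to 0.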